I. INTRODUCTION

Huge amounts of Electronic Health Records (EHRs) collected
over the years have provided a rich base for risk analysis and
prediction. An EHR contains digitally stored healthcare
information about an individual, such as observations,
laboratory tests, diagnostic reports, medications,
procedures, patient identifying information, and allergies. A
special type of EHR is the Health Examination Record (HER)
from annual general health check-ups. For example,
governments such as those of Australia, the U.K., and Taiwan
offer periodic geriatric health examinations as an integral
part of their aged care programs. Since clinical care often
has a specific problem in mind, at a point in time only a
limited and often small set of measures considered necessary
are collected and stored in a person's EHR. By contrast, HERs
are collected for regular surveillance and preventive purposes,
covering a comprehensive set of general health measures,
all collected at a point in time in a systematic way. This paper
proposes a semi-supervised heterogeneous graph-based
algorithm called SHG-Health (Semi-supervised
Heterogeneous Graph on Health) as an evidence-based risk
prediction approach to mining longitudinal health
examination records. To handle heterogeneity, it explores a
heterogeneous graph based on Health Examination Records
called the HeteroHER graph, where examination items in
different categories are modelled as different types of nodes
and their temporal relationships as links.

II. REVIEW OF LITERATURE 

MF Ghalwash, V Radosavljevic, and Z Obradovic [1] proposed a
temporal data mining method for extracting interpretable
patterns from multivariate time series data, which can be
used to assist in providing interpretable early diagnosis. The
problem is formulated as an optimization-based binary
classification task addressed in three steps. Classification is
often employed as a data exploration step, where
summarization of the data in a target class using
interpretable distinct features becomes the central task. The
authors focus on the problem of extracting interpretable
features for early classification on time series.

Tran, T., Phung, D., Luo, W., and Venkatesh, S. [2] construct a
novel ordinal regression framework for predicting medical
risk stratification from EMR. First, a conceptual view of EMR
as a temporal image is constructed to extract a diverse set of
features. Second, ordinal modeling is applied for predicting
cumulative or progressive risk.

Mas S Mohktar, Stephen J Redmond, and Nick C Antoniades [3]
observe that the use of telehealth technologies to remotely
monitor patients suffering from chronic diseases may enable
preemptive treatment. As a means of detecting exacerbation
earlier, and at the resolution of a single day, it has been
proposed that patients with COPD might use a home
telehealth service daily to evaluate their health status.
Existing home telehealth services offer a range of vital sign
monitoring modalities, including measurements of the lungs.

Jin-Mao Wei, Shu-Qin Wang, and Xiao-Jie Yuan [4] note that
cancer classification is the critical basis for patient-tailored
therapy. Conventional histological analysis tends to be
unreliable because different tumors may have a similar
appearance. Various machine learning methods can be
employed to classify cancer tissue samples based on
microarray data.

J. Simon, Pedro J. Caraballo, Terry M.
Therneau, and Steven S [7] maintain an EMR (Electronic
Medical Record) and apply association rule mining to it to
discover sets of risk factors. Their work draws on association
rules, survival analysis, and association rule summarization.

Yanbing Xue and Milos Hauskrecht [6] note that the learning of
classification models in medicine often relies on data labeled
by a human expert. Since labeling of clinical data may be
time-consuming, finding ways of alleviating the labeling costs
is critical for our ability to automatically learn such models.

Sheng, W Ruan, X Li, S Wang, and Z Yang [8] propose a data
mining-based method for prediction of a personal health
index based on annual geriatric medical examination (GME)
records. The health risks are calculated using information
from the cause of death (COD) dataset that is linked to the
GME dataset.

Eichelberg M., Aden T., Riesmeier J., Dogac A., and Laleci G.
[10] discuss the introduction of electronic health records,
which boosts the efficiency of medical services at a lower
cost while still offering a vast range of research challenges.
In that work, the analysis of the documents that were
gathered through the search terms yielded additional
keywords and references to additional document sources.
The following keywords or combinations were used:
software, quality, certification, Electronic/Personal
Medical/Health Record, EHR Standards, EHR certification.

III. SYSTEM ARCHITECTURE / SYSTEM OVERVIEW 

Health risk prediction is necessary for prevention and
proper diagnosis before a disease has completely developed.
The proposed system uses an efficient and robust
classification algorithm based on a live data string.
Electronic health records are not well suited to live or
current data because the records are collected on a yearly
basis. The proposed system is therefore used to predict the
future risk of participants from a live data string, for
prevention and early diagnosis before the disease has
completely developed.


Live Data String: 

In this component, we give live data to the system, consisting
of known and unknown symptoms. On the basis of this data,
the future risks of the participants are predicted.
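
As a rough illustration of this step, the sketch below separates an incoming live data string into known and unknown symptoms. The paper does not specify the input format, so the comma-separated layout, the `KNOWN_SYMPTOMS` vocabulary, and the function name `parse_live_data_string` are all assumptions made for this example.

```python
# Hypothetical vocabulary of symptoms the system already knows about.
KNOWN_SYMPTOMS = {"fever", "cough", "chest pain", "fatigue"}


def parse_live_data_string(line, known=KNOWN_SYMPTOMS):
    """Split a comma-separated symptom string into known and unknown symptoms.

    The comma-separated format is an assumption; the paper does not
    describe how the live data string is encoded.
    """
    known_found, unknown_found = [], []
    for token in line.split(","):
        symptom = token.strip().lower()
        if not symptom:
            continue  # skip empty fields such as a trailing comma
        if symptom in known:
            known_found.append(symptom)
        else:
            unknown_found.append(symptom)
    return known_found, unknown_found


known, unknown = parse_live_data_string("Fever, cough, blurred vision,")
print("known:", known)
print("unknown:", unknown)
```

Both lists would then be passed on to the graph construction step, so that unknown symptoms are kept rather than discarded.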

HeteroHER:

A graph is used to model data that is sparse. To capture
the heterogeneity naturally found in health examination
items, we constructed a graph called HeteroHER, consisting
of multi-type nodes based on health examination records.
To address health risk prediction based on health
examination records, with its heterogeneity and large
unlabeled data problems, we present a semi-supervised
heterogeneous graph-based algorithm called SHG-Health.
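
To make the idea of multi-type nodes concrete, the following is a minimal sketch of how such a graph could be assembled. It is not the authors' construction: the record layout `(participant, year, category, item, value)`, the sample values, and the function name `build_hetero_graph` are assumptions for illustration. Each examination record becomes a node of type "record", and each observed item becomes a node whose type is its category.

```python
from collections import defaultdict

# Hypothetical records: (participant, year, category, item, value).
RECORDS = [
    ("p1", 2010, "lab", "glucose", "high"),
    ("p1", 2011, "lab", "glucose", "normal"),
    ("p1", 2011, "physical", "bmi", "obese"),
    ("p2", 2011, "lab", "glucose", "high"),
]


def build_hetero_graph(records):
    """Return (node_type, edges) for a small heterogeneous graph.

    node_type maps each node to its type; edges is an undirected
    adjacency map linking record nodes to the item nodes they contain.
    """
    node_type = {}
    edges = defaultdict(set)
    for pid, year, category, item, value in records:
        record_node = ("record", pid, year)
        item_node = (category, item, value)
        node_type[record_node] = "record"
        node_type[item_node] = category  # node type follows item category
        edges[record_node].add(item_node)
        edges[item_node].add(record_node)
    return node_type, edges


node_type, edges = build_hetero_graph(RECORDS)
for node in sorted(edges, key=str):
    print(node_type[node], node, "->", sorted(edges[node], key=str))
```

Records from different participants that share an item node, such as a high glucose result here, become connected through it, which is what allows label information to flow between participants in the learning step.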

Semi-Supervised Learning:

The third component of our method is a semi-supervised
learning algorithm for the HeteroHER graph. The algorithm
combines the advantages of class discovery and of
heterogeneous graph learning for handling heterogeneity, in
order to address a specific problem posed by evidence-based
risk prediction from health examination records.
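
The sketch below illustrates the general idea of graph-based semi-supervised learning with an extra class for cases that match no known label. It is a generic label propagation scheme, not the SHG-Health algorithm itself; the function name `propagate_labels`, the parameters `alpha` and `unknown_prior`, and the toy graph are assumptions. Labeled nodes start with full weight on their class, unlabeled nodes start with a small weight on "unknown", and scores are repeatedly averaged over neighbors.

```python
def propagate_labels(edges, seeds, classes, alpha=0.8, iterations=50,
                     unknown_prior=0.1):
    """Generic label propagation with an extra 'unknown' class.

    edges: undirected adjacency map {node: [neighbors]}.
    seeds: {node: class} for the labeled nodes.
    """
    labels = list(classes) + ["unknown"]
    prior = {}
    for node in edges:
        if node in seeds:
            prior[node] = {c: 1.0 if c == seeds[node] else 0.0 for c in labels}
        else:
            # Unlabeled nodes lean slightly towards the extra class.
            prior[node] = {c: unknown_prior if c == "unknown" else 0.0
                           for c in labels}
    scores = {node: dict(p) for node, p in prior.items()}
    for _ in range(iterations):
        updated = {}
        for node, neighbors in edges.items():
            updated[node] = {}
            for c in labels:
                if neighbors:
                    avg = sum(scores[m][c] for m in neighbors) / len(neighbors)
                else:
                    avg = 0.0
                # Blend neighbor evidence with the node's own prior.
                updated[node][c] = alpha * avg + (1 - alpha) * prior[node][c]
        scores = updated
    return {node: max(labels, key=lambda c: scores[node][c]) for node in edges}


# Toy graph: a-b-c and d-e are reachable from a labeled node; f-g is not.
toy_edges = {
    "a": ["b"], "b": ["a", "c"], "c": ["b"],
    "d": ["e"], "e": ["d"],
    "f": ["g"], "g": ["f"],
}
toy_seeds = {"a": "high_risk", "e": "low_risk"}
result = propagate_labels(toy_edges, toy_seeds, ["high_risk", "low_risk"])
print(result)
```

Nodes connected to a labeled case inherit its class, while the component with no labeled neighbor stays in the extra class, which is the behaviour that class discovery is meant to provide for unlabeled cases that may not belong to any known class.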

Classifier:

The system addresses the problem of learning from
unlabeled data by applying a semi-supervised approach. This
is done by maintaining a graph of known and unknown
symptoms, and these graphs are given to the classifier. The
data consist of two types: training data and testing data.
From the training data, the classifier learns which symptoms
are present and which disease they indicate; the classifier
then gives the corresponding prediction of risk.
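
The training and testing split described above can be illustrated with a deliberately simple stand-in classifier. This is not the classifier used by the proposed system: it only counts how often each symptom co-occurs with each disease in the training data and predicts by majority vote. The sample data and the names `train`, `predict`, `training`, and `testing` are hypothetical.

```python
from collections import Counter, defaultdict


def train(examples):
    """Count how often each symptom co-occurs with each disease label."""
    counts = defaultdict(Counter)
    for symptoms, disease in examples:
        for symptom in symptoms:
            counts[symptom][disease] += 1
    return counts


def predict(counts, symptoms):
    """Predict the disease with the most symptom votes, or 'unknown'."""
    votes = Counter()
    for symptom in symptoms:
        votes.update(counts.get(symptom, {}))
    if not votes:
        return "unknown"  # none of the symptoms were seen in training
    return votes.most_common(1)[0][0]


# Hypothetical training and testing data: (symptoms, disease).
training = [
    ({"cough", "fever"}, "respiratory"),
    ({"cough", "wheeze"}, "respiratory"),
    ({"chest pain", "fatigue"}, "cardiac"),
    ({"chest pain"}, "cardiac"),
]
testing = [
    ({"cough"}, "respiratory"),
    ({"chest pain", "fever"}, "cardiac"),
    ({"rash"}, "unknown"),
]

model = train(training)
predictions = [predict(model, symptoms) for symptoms, _ in testing]
correct = sum(p == truth for p, (_, truth) in zip(predictions, testing))
accuracy = correct / len(testing)
print(predictions, accuracy)
```

Keeping the testing data separate from the training data is what makes the reported accuracy a fair estimate of how the classifier behaves on unseen participants.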

Result Analysis:

In this section, high-risk diseases are analyzed on the basis
of the records obtained from the classifier.

IV. SYSTEM ANALYSIS

Fig. 1. Overview of the system architecture

V. ADVANTAGES

1. The SHG-Health algorithm handles a challenging
multi-class classification problem with substantial unlabeled
cases which may or may not belong to the known
classes. This work pioneers risk prediction based on
health examination records in the presence of large
unlabeled data.

2. A novel graph extraction mechanism is introduced for
handling heterogeneity found in longitudinal health
examination records.

3. The proposed graph-based semi-supervised learning
algorithm SHG-Health, which combines the advantages
of heterogeneous graph learning and class discovery,
shows significant performance gain on a large and
comprehensive real health examination dataset of
participants as well as on synthetic datasets.

VI. CONCLUSION

The proposed system shows how data fusion allows health
examination records to be integrated with other types of
datasets, such as hospital-based electronic health records and
participants' living conditions. The SHG-Health algorithm
makes use of the HeteroHER graph and semi-supervised
learning to find various known and unknown symptoms in
the live data given to the system and to predict future risk.